Extraction of Word Set for Increasing Human-Computer Interaction in Information Retrieval
We present a mechanism that provides word sets which can make human-computer interaction more active in the course of information retrieval, using natural language processing technology and a mathematical measure for calculating the degree of inclusion. We show what type of words should be added to the current query, i.e., the keywords that have already been input, in order to make human-computer interaction more creative. We extract related word sets with taxonomical and non-taxonomical relations from documents by employing case-marking particles derived from syntactic analysis. Then, we verify which kind of related word is more useful as an additional word for retrieval support and makes human-computer interaction more fruitful.
CRL at Ntcir2
We have developed systems of two types for NTCIR2. One is an enhanced version
of the system we developed for NTCIR1 and IREX. It submitted retrieval results
for JJ and CC tasks. A variety of parameters were tried with the system. It
used such characteristics of newspapers as locational information in the CC
tasks. The system got good results for both of the tasks. The other system is a
portable system which avoids free parameters as much as possible. The system
submitted retrieval results for JJ, JE, EE, EJ, and CC tasks. The system
automatically determined the number of top documents and the weight of the
original query used in automatic-feedback retrieval. It also determined
relevant terms quite robustly. For EJ and JE tasks, it used document expansion
to augment the initial queries. It achieved good results, except on the CC
tasks.
Comment: 11 pages. Computation and Language. This paper describes our results
of information retrieval in the NTCIR2 contest.
Knowledge Sharing from Domain-specific Documents
Recently, collaborative discussions based on participant-generated documents, e.g., customer questionnaires, aviation reports, and medical records, have been required in various fields such as marketing, transport facilities, and medical treatment, in order to share useful knowledge that is crucial for maintaining various kinds of security, e.g., avoiding air-traffic accidents and malpractice. We introduce several natural language processing techniques for extracting information from such text data and verify the validity of these techniques using aviation documents as an example. We automatically and statistically extract from the documents related words that have not only taxonomical relations such as synonymy but also thematic (non-taxonomical) relations, including causal and entailment relations. These related words are useful for sharing information among participants. Moreover, we acquire domain-specific terms and phrases from the documents in order to pick up and share important topics from such reports.
- …